ABSTRACT

For simple random sampling (without replacement) from a finite population, suitable stochastic processes are constructed from the entire sequence of jackknife estimators based on functions of U-statistics, and these are approximated in distribution by some Brownian bridge processes. Strong convergence of the Tukey estimator of the variance of jackknife U-statistics has also been established. Some applications of these results in sequential analysis relating to finite population sampling are also considered.
1.  INTRODUCTION

Let $\Omega_N$ be a finite population of size $N$, represented by the vector $A_N = (a_{N1}, \ldots, a_{NN})$ of real numbers. Let $X_N = (X_{N1}, \ldots, X_{NN})$ be a random vector which takes on each permutation of the elements of $A_N$ with equal probability $(N!)^{-1}$. Then a random sample of size $n\,(\le N)$ drawn without replacement from $\Omega_N$ may be represented by $X_{Nn} = (X_{N1}, \ldots, X_{Nn})$, so $X_{Nn}$ takes on all possible $n$-tuples $(a_{Ni_1}, \ldots, a_{Ni_n})$, $1 \le i_1 \ne \cdots \ne i_n \le N$, with the common probability $N^{-[n]}$ $(= \{N \cdots (N-n+1)\}^{-1})$, for $n = 1, \ldots, N$.
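The permutation model above can be checked by brute force on a very small population. The following is only a sketch: the function name `ordered_sample_distribution` and the four-element population are illustrative assumptions, not part of the paper. It enumerates all $N!$ permutations and tabulates the first $n$ coordinates (by unit index, so that ties among the values do not matter).

```python
import itertools
from collections import Counter

def ordered_sample_distribution(population, n):
    """Exact law of the first n coordinates of a uniform random permutation.

    Units are identified by their index, so equal values stay distinct.
    """
    indices = range(len(population))
    counts = Counter(perm[:n] for perm in itertools.permutations(indices))
    total = sum(counts.values())
    return {idx: c / total for idx, c in counts.items()}

# Hypothetical toy population with N = 4; ordered samples of size n = 2.
dist = ordered_sample_distribution([2.0, 3.0, 5.0, 7.0], 2)
```

With $N = 4$ and $n = 2$ there are $4 \cdot 3 = 12$ ordered pairs, each with probability $1/12$, in agreement with $N^{-[n]}$.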

For a symmetric kernel $f(X_{Nm})$ of degree $m\,(\ge 1)$, the U-statistic, defined by

$$U_{Nn} = U(X_{Nn}) = \binom{n}{m}^{-1} \sum_{P_{n,m}} f(X_{Ni_1}, \ldots, X_{Ni_m}), \quad n \ge m, \tag{1.1}$$

(where $P_{n,m} = \{(i_1, \ldots, i_m): 1 \le i_1 < \cdots < i_m \le n\}$), is an unbiased estimator of

$$\mu_N = U(X_{NN}) = U(A_N) = \binom{N}{m}^{-1} \sum_{P_{N,m}} f(a_{Ni_1}, \ldots, a_{Ni_m}). \tag{1.2}$$

Various properties of $U_{Nn}$ have been studied by Nandi and Sen (1963), Sen (1970, 1972) and others.
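A minimal sketch of (1.1) and (1.2) follows. The helper names `u_statistic` and `half_squared_diff` and the five-element population are assumptions made for illustration. The kernel $\tfrac12(x-y)^2$ of degree 2 is used because its population value $\mu_N$ is the familiar variance with divisor $N-1$, and unbiasedness can then be verified exactly by averaging over all samples of a given size.

```python
from itertools import combinations
from math import comb

def u_statistic(sample, kernel, m):
    """U-statistic of degree m: average of a symmetric kernel over all m-subsets."""
    n = len(sample)
    if n < m:
        raise ValueError("need at least m observations")
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(n, m)

def half_squared_diff(x, y):
    """Symmetric kernel of degree 2 whose U-statistic is the sample variance."""
    return 0.5 * (x - y) ** 2

# Hypothetical population (N = 5).
population = [1.0, 4.0, 6.0, 9.0, 10.0]
mu_N = u_statistic(population, half_squared_diff, 2)

# Average of U_{Nn} over all C(5, 3) equally likely samples of size n = 3.
avg_U = sum(
    u_statistic(s, half_squared_diff, 2) for s in combinations(population, 3)
) / comb(5, 3)
```

Since the kernel is symmetric, averaging over unordered samples is enough; the order in which units are drawn does not change $U_{Nn}$.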


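Let us consider a real-valued function

$$\theta_N = g(\mu_N), \tag{1.3}$$

where $g$ is a smooth function. Though $U_{Nn}$ is an unbiased estimator of $\mu_N$, $\hat\theta_{Nn} = g(U_{Nn})$ is not generally unbiased for $\theta_N$. For this reason, we consider the following jackknife estimator. Let

$$U^{(i)}_{Nn-1} = \binom{n-1}{m}^{-1} \sum_{P^{(i)}_{n-1,m}} f(X_{Ni_1}, \ldots, X_{Ni_m}), \tag{1.4}$$

where $P^{(i)}_{n-1,m} = \{(i_1, \ldots, i_m): 1 \le i_1 < \cdots < i_m \le n$ with $i_j \ne i$, $1 \le j \le m\}$, for $i = 1, \ldots, n$. Also, let

$$\hat\theta^{(i)}_{Nn-1} = g(U^{(i)}_{Nn-1}), \quad 1 \le i \le n; \tag{1.5}$$

$$\hat\theta_{Nn,i} = n\hat\theta_{Nn} - (n-1)\hat\theta^{(i)}_{Nn-1}, \quad 1 \le i \le n; \tag{1.6}$$

$$\theta^*_{Nn} = n^{-1} \sum_{i=1}^{n} \hat\theta_{Nn,i}. \tag{1.7}$$

Then $\theta^*_{Nn}$ is the jackknife estimator of $\theta_N$.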
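The construction (1.4)-(1.7) can be sketched as follows. The names `u_statistic` and `jackknife` and the sample values are assumptions for illustration only. The example takes the kernel $f(x) = x$ (so $m = 1$) and $g(u) = u^2$, a case where the jackknife estimator is known in closed form: $\theta^*_{Nn} = \bar X^2 - s^2/n$, with $s^2$ the sample variance (divisor $n-1$).

```python
from itertools import combinations
from math import comb

def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel of degree m over all m-subsets."""
    return sum(kernel(*c) for c in combinations(sample, m)) / comb(len(sample), m)

def jackknife(sample, kernel, m, g):
    """Return (theta_star, pseudo_values) following (1.4)-(1.7)."""
    n = len(sample)
    theta_hat = g(u_statistic(sample, kernel, m))
    pseudo_values = []
    for i in range(n):
        reduced = sample[:i] + sample[i + 1:]                      # drop the i-th unit
        theta_loo = g(u_statistic(reduced, kernel, m))             # (1.5)
        pseudo_values.append(n * theta_hat - (n - 1) * theta_loo)  # (1.6)
    theta_star = sum(pseudo_values) / n                            # (1.7)
    return theta_star, pseudo_values

# Hypothetical sample; kernel f(x) = x, g(u) = u**2.
sample = [1.0, 2.0, 4.0, 7.0]
theta_star, pseudo = jackknife(sample, lambda x: x, 1, lambda u: u * u)
```

Here $\bar X = 3.5$ and $s^2 = 7$, so the jackknife value is $12.25 - 7/4 = 10.5$, compared with the plug-in value $12.25$.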

For random sampling from an infinite population, jackknifing of U-statistics has been studied by Arvesen (1969). Recently, Sen (1977) has carried the investigation further by establishing invariance principles for jackknife statistics and applying them to some problems in sequential analysis. The object of the present investigation is to extend the results of Sen (1977) to sampling from a finite population and to employ them in some problems of survey sampling.

The  basic  assumptions  and  preliminary  notions  are  outlined 
in  Section  2.  Section  3 deals  with  the  main  theorems  and  their 
derivations.  In  Section  4,  some  applications  and  clarification 
of  certain  results,  which  have  so  far  been  tacitly  assumed  by  the 
workers  in  this  field,  are  discussed. 


2.  PRELIMINARY  NOTIONS 


As in Sen (1972), we define for $h$: $0 \le h \le m$,

$$f_h(a_{Ni_1}, \ldots, a_{Ni_h}) = (N-h)^{-[m-h]} \sum_{(h)} f(a_{Ni_1}, \ldots, a_{Ni_m}), \tag{2.1}$$

where $\sum_{(h)}$ extends over all $1 \le i_{h+1} \ne \cdots \ne i_m \le N$ with $i_{h+j} \ne i_s$ for $j = 1, \ldots, m-h$ and $s = 1, \ldots, h$. Then $f_0 = \mu_N$ and $f_m = f$. Also, let

$$\zeta_{h,N} = \mathrm{Var}\{f_h(X_{N1}, \ldots, X_{Nh})\} = E f_h^2(X_{N1}, \ldots, X_{Nh}) - \mu_N^2, \tag{2.2}$$

for $0 \le h \le m$, where $\zeta_{0,N} = 0$, and it follows from Nandi and Sen (1963) that

$$0 \le \zeta_{h,N} \le (h/g)\,\zeta_{g,N}, \quad \forall\ 1 \le h \le g \le m. \tag{2.3}$$

For the study of asymptotic properties, we conceive of a sequence $\{\Omega_N\}$ of populations and allow $N \to \infty$. We assume that

$$\text{(A)} \quad \inf_N \zeta_{1,N} > 0 \quad \text{and} \quad \sup_N \zeta_{m,N} < \infty, \tag{2.4}$$

$$\text{(B)} \quad \sup_N E|f(X_{Nm})|^4 < \infty, \tag{2.5}$$

and (C) $g$, in (1.3), has a bounded second derivative in some neighborhood of $\mu_N$. Note that the second condition in (2.4) follows from (2.5).

Note that by (3.22) of Nandi and Sen (1963), for all $N \ge n \ge m$,

n'lm1M/(N»"]}'l,NSV<UNn>  5"'‘-{nStK,N  • <2'6> 
so that by (2.4), for $m \le n \le N(1-\varepsilon)$, $\varepsilon > 0$,

$$0 < \inf_N nV(U_{Nn}) \le \sup_N nV(U_{Nn}) < \infty, \tag{2.7}$$

while $nV(U_{Nn}) \to 0$ as $n \to N$. Further, by (2.4) and (2.5),

$$\max_{1 \le i \le N} \{|f_1(a_{Ni}) - \mu_N| / \zeta_{1,N}^{1/2}\} = O(N^{1/4}), \quad \text{as } N \to \infty. \tag{2.8}$$

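Let us also define, as in Sen (1972),

$$U^{(h)}_{Nn} = \binom{n}{h}^{-1} \sum_{P_{n,h}} f_h(X_{Ni_1}, \ldots, X_{Ni_h}), \quad 0 \le h \le m, \tag{2.9}$$

where $U^{(0)}_{Nn} = \mu_N$ and $U^{(m)}_{Nn} = U_{Nn}$, $\forall\ n \ge m$, and let

$$W^{(h)}_{Nn} = \sum_{k=0}^{h} (-1)^k \binom{h}{k} U^{(h-k)}_{Nn}, \quad h = 0, 1, \ldots, m. \tag{2.10}$$

Then, as in Sen (1972),

$$U_{Nn} = \sum_{h=0}^{m} \binom{m}{h} W^{(h)}_{Nn}. \tag{2.11}$$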


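Given that the collection $(a_{Ni_1}, \ldots, a_{Ni_n})$ corresponds to a sample $X_{Nn}$ (without specifying the order in which the elements occur), $X_{Nn}$ can assume any one of the $n!$ possible permutations of $(a_{Ni_1}, \ldots, a_{Ni_n})$ with the same conditional probability $(n!)^{-1}$. The conditional expectation with respect to this conditional distribution is denoted by $E(\cdot \mid C_{Nn})$. Then,

$$E(\hat\theta_{Nn-1} \mid C_{Nn}) = n^{-1} \sum_{i=1}^{n} \hat\theta^{(i)}_{Nn-1}, \quad \forall\ n > m, \tag{2.12}$$

and, as a result, by (1.5)-(1.7) and (2.12),

$$\theta^*_{Nn} = \hat\theta_{Nn} + (n-1) E\{(\hat\theta_{Nn} - \hat\theta_{Nn-1}) \mid C_{Nn}\}, \quad \forall\ n > m. \tag{2.13}$$

The Tukey estimator of the variance of $n^{1/2}(\theta^*_{Nn} - \theta_N)/[(N-n)/N]^{1/2}$ is

$$V_{Nn} = (n-1)^{-1} \sum_{i=1}^{n} (\hat\theta_{Nn,i} - \theta^*_{Nn})^2, \tag{2.14}$$

where by similar arguments it follows that

$$V_{Nn} = n(n-1)\,\mathrm{Var}\{(\hat\theta_{Nn} - \hat\theta_{Nn-1}) \mid C_{Nn}\}. \tag{2.15}$$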
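The Tukey estimator and the conditional-variance representation can be compared numerically. The sketch below is illustrative only: `leave_one_out`, `tukey_variance` and the sample are assumed names and data, and the estimator is passed as a plain Python callable standing in for $g(U_{Nn})$. The conditional variance given $C_{Nn}$ is the variance (divisor $n$) of the $n$ equally likely leave-one-out values.

```python
def leave_one_out(sample, estimator):
    """Leave-one-out values of the estimator, one per deleted unit."""
    return [estimator(sample[:i] + sample[i + 1:]) for i in range(len(sample))]

def tukey_variance(sample, estimator):
    """Tukey variance estimator built from the jackknife pseudo-values."""
    n = len(sample)
    theta_hat = estimator(sample)
    pseudo = [n * theta_hat - (n - 1) * t for t in leave_one_out(sample, estimator)]
    theta_star = sum(pseudo) / n
    return sum((p - theta_star) ** 2 for p in pseudo) / (n - 1)

def conditional_variance_form(sample, estimator):
    """n(n-1) times the permutation variance of the leave-one-out values."""
    n = len(sample)
    loo = leave_one_out(sample, estimator)
    centre = sum(loo) / n
    return n * (n - 1) * sum((t - centre) ** 2 for t in loo) / n

def mean(values):
    return sum(values) / len(values)

# Hypothetical sample; the estimator is the square of the sample mean.
sample = [1.0, 2.0, 4.0, 7.0]
v_tukey = tukey_variance(sample, lambda s: mean(s) ** 2)
v_cond = conditional_variance_form(sample, lambda s: mean(s) ** 2)
```

For the sample mean itself the pseudo-values are the observations, so the Tukey estimator reduces to the usual sample variance.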


Both  (2.13)  and  (2.15)  are  in  agreement  with  the  parallel  results 
for  infinite  populations,  treated  in  Sen  (1977).  We  conclude  this 
section  with  the  following. 


Definition: Let $\{T_{Nn},\ n \le N,\ N \ge 1\}$ be a double sequence of statistics and $\{a_N,\ N \ge 1\}$ be a sequence of real numbers. Then, $T_{Nn} - a_N$ strongly converges (s.c.) to 0, if for every $\varepsilon > 0$ and every sequence $\{N^* (\le N)\}$, such that $N^* \to \infty$ as $N \to \infty$ (but $N^*/N$ may or may not go to 0),

$$P\Big\{\max_{N^* \le n \le N} |T_{Nn} - a_N| > \varepsilon\Big\} \to 0. \tag{2.16}$$

We shall find this definition useful in the subsequent sections.

3.  INVARIANCE PRINCIPLES FOR $\{\theta^*_{Nn}\}$


Let $\{N^*\}$ be a sequence of positive integers such that as $N \to \infty$, $N^* \to \infty$ but $N^*/N \to 0$ (viz., $N^* = N^\lambda$, $0 < \lambda < 1$, or $\log N$, etc.). We then consider the stochastic process $Z_N = \{Z_N(t),\ t \in I = [0,1]\}$ by letting $t_{N,k} = k/N$, $k = 0, 1, \ldots, N$, and

$$Z_N(t) = \begin{cases} 0, & t < N^*/N, \\ Z_{Nk} = k (N V_{Nk})^{-1/2} (\theta^*_{Nk} - \theta_N), & t = t_{N,k},\ k \ge N^*, \\ N(t_{N,k+1} - t) Z_{Nk} + N(t - t_{N,k}) Z_{Nk+1}, & t \in [t_{N,k}, t_{N,k+1}], \end{cases} \tag{3.1}$$

for $k < N$. Then, $Z_N$ has a continuous sample path and it belongs to the space $C[0,1]$ of continuous real-valued functions on $I$. Let $Z^0 = \{Z^0(t),\ t \in I\}$ be a standard Brownian bridge on $I$. That is, $Z^0$ is a Gaussian function with $EZ^0(t) = 0$ and $EZ^0(s)Z^0(t) = s \wedge t - st = \min(s,t) - st$, $\forall\ s, t \in I$. We say [viz., Billingsley (1968)] that $Z_N$ converges in law (or distribution) to $Z^0$ if for every continuous functional $h(\cdot)$ assuming values in $R^k$, the $k\,(\ge 1)$-dimensional Euclidean space, as $N \to \infty$, $h(Z_N)$ has asymptotically the same distribution as $h(Z^0)$. For example, the above weak convergence of $Z_N$ to $Z^0$ insures that $\sup\{Z_N(t): 0 \le t \le 1\}$, $\sup\{|Z_N(t)|: 0 \le t \le 1\}$ and $\int_0^1 Z_N^2(t)\,dt$ have the same limiting distributions as $\sup\{Z^0(t): 0 \le t \le 1\}$, $\sup\{|Z^0(t)|: 0 \le t \le 1\}$ and $\int_0^1 [Z^0(t)]^2\,dt$, respectively. This mode of convergence is stronger than the asymptotic normality of $n^{1/2}(\theta^*_{Nn} - \theta_N)$ and it also insures that for finitely many $t_1, \ldots, t_m$ (all belonging to $I$), $[Z_N(t_1), \ldots, Z_N(t_m)]$ has asymptotically a multinormal distribution. Then, we have the following.

Theorem 3.1. Under Assumptions (A), (B) and (C) of Section 2, $Z_N$ converges in law to $Z^0$.
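To make the process in (3.1) concrete, here is a minimal sketch for the simplest special case, which is an assumption of this example and not a restriction in the paper: kernel $f(x) = x$ with $m = 1$ and $g$ the identity, so that $\theta^*_{Nk}$ is the sample mean of the first $k$ units and $V_{Nk}$ is their sample variance. The function name `z_path`, the toy population and the seed are hypothetical.

```python
import random
from statistics import mean, variance

def z_path(population, n_star, seed=0):
    """Values Z_{Nk}, k = 0..N, for the sample mean under a random permutation.

    Z_{Nk} = k * (mean of first k units - population mean) / sqrt(N * V_{Nk}),
    set to 0 for k < n_star (and wherever the variance estimate is 0).
    """
    rng = random.Random(seed)
    N = len(population)
    x = list(population)
    rng.shuffle(x)                       # one uniform random permutation
    theta_N = mean(population)
    z = [0.0] * (N + 1)
    for k in range(max(n_star, 2), N + 1):
        v = variance(x[:k])              # Tukey estimator when theta-hat is the mean
        if v > 0:
            z[k] = k * (mean(x[:k]) - theta_N) / (N * v) ** 0.5
    return z

# Hypothetical population of N = 60 values.
population = [float((7 * i) % 23) for i in range(60)]
z = z_path(population, n_star=8, seed=1)
```

The path is tied down at both ends: it is 0 before $N^*$ by construction and 0 at $k = N$ because the full sample reproduces the population mean, which is the behaviour expected of a Brownian bridge limit.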

Before  we  present  the  proof  of  the  theorem,  we  consider 
several  results.  First,  the  following  theorem  (whose  proof  is 
postponed  to  Section  5)  is  of  basic  importance  in  this  context. 


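Theorem 3.2. Under (2.5), $|n(n-1)E\{(U_{Nn-1} - U_{Nn})^2 \mid C_{Nn}\} - m^2\zeta_{1,N}| \to 0$ s.c., as $N \to \infty$.

Note that $n(n-1)E\{(U_{Nn-1} - U_{Nn})^2 \mid C_{Nn}\} = (n-1)\sum_{i=1}^{n} [U^{(i)}_{Nn-1} - U_{Nn}]^2 \ge (n-1)\{\max_{1 \le i \le n} [U^{(i)}_{Nn-1} - U_{Nn}]^2\}$, so that by (2.5) and Theorem 3.2,

$$\max_{N^* \le n \le N}\ \max_{1 \le i \le n}\ (n-1)[U^{(i)}_{Nn-1} - U_{Nn}]^2 = O(1), \quad \text{in probability}. \tag{3.2}$$

Further, by Theorem 1 of Sen (1970), $U_{Nn} - \mu_N \to 0$ s.c., and hence, by (3.2),

$$\max_{1 \le i \le n} |U^{(i)}_{Nn-1} - \mu_N| \to 0 \ \text{s.c., as } N \to \infty. \tag{3.3}$$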


Let us now define

$$\gamma_N^2 = [g'(\mu_N)]^2 m^2 \zeta_{1,N}, \quad N \ge 1. \tag{3.4}$$

Then, by virtue of Theorem 3.2, and (3.2) and (3.3), we may virtually repeat the steps in the proof of Theorem 3.1 of Sen (1977) and obtain that under Assumptions (B) and (C),

$$V_{Nn} - \gamma_N^2 \to 0 \ \text{s.c., as } N \to \infty, \tag{3.5}$$

$$\theta^*_{Nn} - \hat\theta_{Nn} = O(n^{-1}) \ \text{in the strong sense in (2.16)}. \tag{3.6}$$


Let us now return to the proof of Theorem 3.1. Suppose that in (3.1), we replace $V_{Nk}$ by $\gamma_N^2$ and denote the resulting process by $Z_N^* = \{Z_N^*(t),\ t \in I\}$. Then, by definition,

$$\rho(Z_N, Z_N^*) = \sup_{t \in I} |Z_N(t) - Z_N^*(t)| \le \Big\{\max_{N^* \le k \le N} |(\gamma_N / V_{Nk}^{1/2}) - 1|\Big\}\Big\{\sup_{t \in I} |Z_N^*(t)|\Big\}. \tag{3.7}$$

Hence, if $Z_N^*$ converges weakly to $Z^0$ (which implies that $\sup\{|Z_N^*(t)|: t \in I\} = O_p(1)$), then, by (3.5) and (3.7), $\rho(Z_N, Z_N^*) \to 0$ in probability as $N \to \infty$. Hence, it suffices to prove the following.

Theorem 3.3. Under the hypothesis of Theorem 3.1, $Z_N^*$ converges in law to $Z^0$.


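Proof. We note that

$$\hat\theta_{Nn} = g(U_{Nn}) = g(\mu_N) + g'(\mu_N)[U_{Nn} - \mu_N] + \tfrac12 [U_{Nn} - \mu_N]^2 g''(hU_{Nn} + (1-h)\mu_N), \quad 0 < h < 1, \tag{3.8}$$

where by Assumption (C), $g''(\cdot)$ is bounded in some neighborhood of $\mu_N$. Also, by Theorem 1 of Sen (1970), $n^{1/2}(U_{Nn} - \mu_N)^2 \to 0$ s.c., and hence,

$$nN^{-1/2}\,\big|\{\hat\theta_{Nn} - \theta_N\} - g'(\mu_N)[U_{Nn} - \mu_N]\big| \to 0 \ \text{s.c.} \tag{3.9}$$

Suppose now that in (3.1), we replace $\theta^*_{Nk} - \theta_N$ and $V_{Nk}$ by $g'(\mu_N)(U_{Nk} - \mu_N)$ and $\gamma_N^2$, respectively, and denote the resulting process by $Z_N^{**} = \{Z_N^{**}(t),\ t \in I\}$. Then, by Theorem 2.1 of Sen (1972), $Z_N^{**}$ converges in law to $Z^0$, while by (3.6) and (3.9), along with the weak convergence of $Z_N^{**}$,

$$\rho(Z_N^*, Z_N^{**}) \to 0 \ \text{in probability}, \tag{3.10}$$

and hence, $Z_N^*$ converges weakly to $Z^0$. Q.E.D.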

So far, we have assumed that the $a_{Ni}$ are all real numbers. There is no harm in letting them be real $p$-vectors, for some $p \ge 1$. The same permutation argument holds in this case, and hence, the proofs remain unaltered. Also, in practical problems, $\mathbf{U}_{Nn} = (U_{Nn(1)}, \ldots, U_{Nn(q)})$ (for some $q \ge 1$) may be a $q$-vector, where the $U_{Nn(j)}$ are defined by (1.1) for kernels $f_{(j)}$ of degree $m_j\,(\ge 1)$, and in the same framework, we have $\boldsymbol\mu_N = \mathbf{U}_{NN}$, $\theta_N = g(\boldsymbol\mu_N)$ and $\hat\theta_{Nn} = g(\mathbf{U}_{Nn})$, where we assume that $g$ has bounded second order partial derivatives in some neighborhood of $\boldsymbol\mu_N$. Then, replacing in (1.5) $U^{(i)}_{Nn-1}$ by $\mathbf{U}^{(i)}_{Nn-1}$ (defined as in (1.4) with $f = (f_{(1)}, \ldots, f_{(q)})$), we define the jackknife estimator $\theta^*_{Nn}$ as in (1.6)-(1.7). If we define the $f_{h(j)}$ as in (2.1) for $f = f_{(j)}$ and replace, in (2.2), $f_h^2$ by $f_{h(j)}(X_{Nh}) f_{h(\ell)}(X_{Nh})$, the resulting quantity is denoted by $\zeta_{h,N(j\ell)}$, for $j, \ell = 1, \ldots, q$ and $h \ge 0$. Then, assuming that (2.4)-(2.5) hold for each $j\,(= 1, \ldots, q)$, it follows by arguments very similar to the ones in the proof of Theorem 3.2 (see Appendix) that for the jackknife variance $V_{Nn}$,


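$$V_{Nn} - \gamma_N^2 \to 0 \ \text{s.c.}, \tag{3.11}$$

where

$$\gamma_N^2 = \sum_{j=1}^{q} \sum_{\ell=1}^{q} m_j m_\ell\, g'_j(\boldsymbol\mu_N)\, g'_\ell(\boldsymbol\mu_N)\, \zeta_{1,N(j\ell)}, \tag{3.12}$$

and $g'_j$ denotes the partial derivative of $g$ with respect to its $j$-th argument.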


With this modification, Theorems 3.1 and 3.2 hold under no extra regularity condition. Actually, (3.2) and (3.3) hold coordinatewise for each $j\,(= 1, \ldots, q)$; in (3.8) we have a multivariate second order Taylor series expansion, so that in (3.9) (and in the definition of the process introduced after (3.9)), $g'(\mu_N)(U_{Nn} - \mu_N)$ has to be replaced by $\sum_{j=1}^{q} g'_j(\boldsymbol\mu_N)(U_{Nn(j)} - \mu_{N(j)})$ which, being a linear combination of $\mathbf{U}_{Nn} - \boldsymbol\mu_N$, is itself a U-statistic, and hence, Theorem 2.1 of Sen (1972) holds directly.

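It is also possible to consider a vector $\boldsymbol\theta_N = \mathbf{g}(\boldsymbol\mu_N)$ and $\hat{\boldsymbol\theta}_{Nn} = \mathbf{g}(\mathbf{U}_{Nn})$, where $\mathbf{g}(\cdot) = (g_{(1)}(\cdot), \ldots, g_{(r)}(\cdot))$ for some $r \ge 1$ and $\mathbf{U}_{Nn} = (U_{Nn(1)}, \ldots, U_{Nn(q)})$, for some $q \ge 1$. In such a case, we assume that for each $g_{(s)}$ and $U_{Nn(j)}$, the Assumptions (A), (B) and (C) of Section 2 are met. Defining the $\mathbf{U}^{(i)}_{Nn-1}$ as in (1.4) for $f = (f_{(1)}, \ldots, f_{(q)})$ and $\hat{\boldsymbol\theta}^{(i)}_{Nn-1} = \mathbf{g}(\mathbf{U}^{(i)}_{Nn-1})$, $1 \le i \le n$, the jackknife estimator $\boldsymbol\theta^*_{Nn}$ is again defined by (1.6)-(1.7). The Tukey estimator of the dispersion matrix of $n^{1/2}(\boldsymbol\theta^*_{Nn} - \boldsymbol\theta_N)$ is given by

$$\mathbf{V}_{Nn} = (n-1)^{-1} \sum_{i=1}^{n} (\hat{\boldsymbol\theta}_{Nn,i} - \boldsymbol\theta^*_{Nn})'(\hat{\boldsymbol\theta}_{Nn,i} - \boldsymbol\theta^*_{Nn}). \tag{3.13}$$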


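Let us consider the matrix $\boldsymbol\Gamma_N = ((\gamma_{N,ss'}))$ where, for every $s, s'\,(= 1, \ldots, r)$,

$$\gamma_{N,ss'} = \sum_{j=1}^{q} \sum_{\ell=1}^{q} m_j m_\ell\, g'_{(s)j}(\boldsymbol\mu_N)\, g'_{(s')\ell}(\boldsymbol\mu_N)\, \zeta_{1,N(j\ell)}, \tag{3.14}$$

and $g'_{(s)j}$ is the partial derivative of $g_{(s)}$ with respect to the $j$-th argument, for $j = 1, \ldots, q$ and $s = 1, \ldots, r$. Then, by a direct coordinatewise extension of (3.11)-(3.12), we have

$$\mathbf{V}_{Nn} - \boldsymbol\Gamma_N \to \mathbf{0} \ \text{s.c.} \tag{3.15}$$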

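Consider then a vector-valued stochastic process $\mathbf{Z}_N = \{\mathbf{Z}_N(t),\ t \in I\}$, where $\mathbf{Z}_N(t)$ is defined [as in (3.1)] by linear interpolation of $\mathbf{Z}_{Nk}$, $0 \le k \le N$, and

$$\mathbf{Z}_{Nk} = \begin{cases} kN^{-1/2}\,(\boldsymbol\theta^*_{Nk} - \boldsymbol\theta_N)\mathbf{V}_{Nk}^{-1/2}, & \text{if } \mathbf{V}_{Nk} \text{ is positive definite and } N^* \le k \le N, \\ \mathbf{0}, & \text{otherwise}. \end{cases} \tag{3.16}$$

Finally, let $\mathbf{Z}^0 = \{\mathbf{Z}^0(t),\ t \in I\}$ be a vector-Gaussian function on $I$ with $E\mathbf{Z}^0(t) = \mathbf{0}$ and $E\{[\mathbf{Z}^0(t)]'[\mathbf{Z}^0(s)]\} = \{(s \wedge t) - st\}\mathbf{I}_r$, $\forall\ s, t \in I$, where $\mathbf{I}_r$ is the unit matrix of order $r$. Thus, the components of $\mathbf{Z}^0$ are all independent Brownian bridges on $I$. Then, we have the following.

Theorem 3.4. Under the conditions mentioned above, $\mathbf{Z}_N$ converges in law to $\mathbf{Z}^0$, whenever $\boldsymbol\Gamma_N$ is p.d.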

Outline of the proof. Let us define $\mathbf{Z}_N^* = \{\mathbf{Z}_N^*(t),\ t \in I\}$, by replacing in (3.16) $\mathbf{V}_{Nk}$ by $\boldsymbol\Gamma_N$. Then, by arguments similar to those in (3.7),

$$\rho(\mathbf{Z}_N, \mathbf{Z}_N^*) \le \Big\{\max_{N^* \le k \le N} \|\boldsymbol\Gamma_N^{1/2}\mathbf{V}_{Nk}^{-1/2} - \mathbf{I}_r\|\Big\}\Big\{\sup_{t \in I} \|\mathbf{Z}_N^*(t)\|\Big\}, \tag{3.17}$$

where $\|\mathbf{a}\|^2 = \mathbf{a}\mathbf{a}'$ and $\|\mathbf{A}\|^2 =$ trace of $\mathbf{A}^2$. Hence, by virtue of (3.15), it suffices to show that $\mathbf{Z}_N^*$ converges in law to $\mathbf{Z}^0$. Towards this, we consider a direct (coordinatewise) extension of (3.8)-(3.9) (with modifications as noted after (3.12)), and the rest of the proof follows by using a direct (vector-) extension of Theorem 2.1 of Sen (1972).

4.  APPLICATIONS 


We conceive of a random sample $(\mathbf{X}_{N1}, \ldots, \mathbf{X}_{Nn})$ of size $n$ drawn without replacement from $\Omega_N$, where the $\mathbf{a}_{Ni}$ (and hence, $\mathbf{X}_{Ni}$) are all $p$-vectors, for some $p \ge 1$. We consider the following applications.


4.1.  Estimation and Testing of Multiple Regression Coefficients


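Let us denote the population dispersion matrix by

$$\boldsymbol\Lambda_N = (N-1)^{-1} \sum_{i=1}^{N} (\mathbf{a}_{Ni} - \bar{\mathbf{a}}_N)'(\mathbf{a}_{Ni} - \bar{\mathbf{a}}_N); \quad \bar{\mathbf{a}}_N = N^{-1} \sum_{i=1}^{N} \mathbf{a}_{Ni}. \tag{4.1}$$

We write $\boldsymbol\Lambda_N = ((\lambda_{Nrs}))_{r,s=1,\ldots,p}$ and denote the minor of $\lambda_{Nrs}$ by $\boldsymbol\Lambda_{Nrs}$, $r, s = 1, \ldots, p$. Also, we denote $\mathbf{X}_{Ni}$ by $(X^{(1)}_{Ni}, \ldots, X^{(p)}_{Ni})$, $i = 1, \ldots, N$. Then the population regression coefficients of $X^{(p)}_{Ni}$ on $(X^{(1)}_{Ni}, \ldots, X^{(p-1)}_{Ni})$ ($= \mathbf{X}^*_{Ni}$, say) are given by

$$\boldsymbol\beta_N = \boldsymbol\Lambda_{Npp}^{-1}\boldsymbol\lambda^*_{Np}, \quad \text{where } \boldsymbol\lambda^*_{Np} = (\lambda_{Np1}, \ldots, \lambda_{Npp-1})'. \tag{4.2}$$

The usual sample estimator of $\boldsymbol\beta_N$ based on $\mathbf{X}_{Nn}$ is

$$\mathbf{b}_{Nn} = \mathbf{L}_{Nnpp}^{-1}\mathbf{l}^*_{Nnp}, \quad \text{where } \mathbf{l}^*_{Nnp} = (l_{Nnp1}, \ldots, l_{Nnpp-1})', \tag{4.3}$$

$\mathbf{L}_{Nn} = ((l_{Nnrs}))$, $\mathbf{L}_{Nnrs}$ is the minor of $l_{Nnrs}$,

$$\mathbf{L}_{Nn} = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \phi(\mathbf{X}_{Ni}, \mathbf{X}_{Nj}),$$

and

$$\phi(\mathbf{X}_{Ni}, \mathbf{X}_{Nj}) = \tfrac12 (\mathbf{X}_{Ni} - \mathbf{X}_{Nj})'(\mathbf{X}_{Ni} - \mathbf{X}_{Nj}) \tag{4.4}$$

is a matrix of order $p \times p$. Though $\mathbf{L}_{Nn}$ is unbiased for $\boldsymbol\Lambda_N$, $\mathbf{b}_{Nn}$ is not necessarily so. But $\mathbf{b}_{Nn}$ is a function of the matrix $\mathbf{L}_{Nn}$ of U-statistics, and we are naturally tempted to use jackknifing to reduce the bias of $\mathbf{b}_{Nn}$; the jackknife estimator as defined in (1.7) [and before (3.13)] is denoted by $\mathbf{b}^*_{Nn}$. Using the results of MacRae (1974), we have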


9 b, 


Nn 

9L, 

~Nnpp 


(k, 


-1 


0 


-1 


....  ...  I ,}E(p-l,p-l){L'  « I (4.5) 

-Nnpp  "^p-l  ~ 1 1 ~Nnpp  p-1  ^np  ’ 

where $\otimes$ denotes the direct product, $\mathbf{E}(p-1, p-1) = ((\mathbf{E}_{ij}))_{i,j=1,\ldots,p-1}$ (of order $(p-1)^2 \times (p-1)^2$) and $\mathbf{E}_{ij}$ is a $(p-1) \times (p-1)$ matrix with the element 1 in its $(i,j)$-th and $(j,i)$-th positions and 0 elsewhere, for $i, j = 1, \ldots, p-1$. Also,


(4.6) 


(4 .7  j 


!Nn',>>'  I-1 

Thus, if we assume that the characteristic roots of $\boldsymbol\Lambda_N$ are all bounded away from 0 and $\infty$, then condition (C) of Section 2 holds, while (2.5) holds, if in addition, we assume that

$$\limsup_N N^{-1} \sum_{i=1}^{N} |a^{(j)}_{Ni} - \bar a^{(j)}_N|^8 < \infty, \quad \forall\ j = 1, \ldots, p. \tag{4.8}$$


Let $\mathbf{V}_{Nn}$ be the jackknife dispersion matrix of $n^{1/2}(\mathbf{b}^*_{Nn} - \boldsymbol\beta_N)$, defined by (3.13). Then from Theorem 3.4, we claim that whenever $n\,(\le N)$ is large, $T_{Nn} = nN(N-n)^{-1}(\mathbf{b}^*_{Nn} - \boldsymbol\beta_N)'\mathbf{V}_{Nn}^{-1}(\mathbf{b}^*_{Nn} - \boldsymbol\beta_N)$ has asymptotically the chi-square distribution with $p-1$ degrees of freedom (DF), and this can be incorporated in testing a null hypothesis $H_0$: $\boldsymbol\beta_N = \boldsymbol\beta_0$ (specified) against $H_1$: $\boldsymbol\beta_N \ne \boldsymbol\beta_0$, or to provide a (simultaneous) confidence region for $\boldsymbol\beta_N$. Let $\chi^2_{t,\alpha}$ be the upper $100\alpha\%$ point of the chi-square distribution with $t$ DF and let $T^0_{Nn}$ be defined by replacing $\boldsymbol\beta_N$ by $\boldsymbol\beta_0$ in the definition of $T_{Nn}$. Then, we accept or reject $H_0$ according as $T^0_{Nn}$ is $\le$ or $> \chi^2_{p-1,\alpha}$, where $\alpha\,(0 < \alpha < 1)$ is the desired level of significance of the test. Further, if we consider the (ellipsoidal) region

$$I_{Nn} = \{\boldsymbol\beta: nN(N-n)^{-1}(\mathbf{b}^*_{Nn} - \boldsymbol\beta)'\mathbf{V}_{Nn}^{-1}(\mathbf{b}^*_{Nn} - \boldsymbol\beta) \le \chi^2_{p-1,\alpha}\}, \tag{4.9}$$

then for large $n$, $I_{Nn}$ provides a confidence region for $\boldsymbol\beta_N$ with confidence coefficient $1 - \alpha$.


In actual practice, neither $\mathbf{V}_{Nn}$ nor its population counterpart $\boldsymbol\Gamma_N$, defined by (3.14), is usually known in advance, and hence, an improper choice of $n$ may result either in excessive cost (due to over-sampling) or in a larger diameter of $I_{Nn}$ [or small power of the test based on $T^0_{Nn}$] (due to under-sampling). For this reason, a sequential procedure may be adopted which, through regular updating of information as sampling proceeds, results in (nearly) optimal solutions. These sequential procedures, in turn, rest on the invariance principles considered in Section 3. Let $\mathrm{ch}_1(\mathbf{A})$ be the largest characteristic root of $\mathbf{A}$ and let

$$n^*_d = \min\{k: k \ge n_0 \text{ and } \mathrm{ch}_1(\mathbf{V}_{Nk}) \le d^2 kN[(N-k)\chi^2_{p-1,\alpha}]^{-1}\}, \tag{4.10}$$

where $d\,(> 0)$ is a preassigned (small) positive number and $n_0\,(\ge p)$ is an initial sample size with which sampling commences.
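The stopping rule (4.10) is simple to state as code once the sequence of largest characteristic roots is available. The sketch below is an illustration under assumptions: the function name `stopping_time`, the convention that `largest_roots[k]` holds $\mathrm{ch}_1(\mathbf{V}_{Nk})$, and the numerical inputs are all hypothetical, and the chi-square point is supplied by the caller rather than computed.

```python
def stopping_time(largest_roots, N, n0, d, chi2_point):
    """First k >= n0 with ch_1(V_Nk) <= d^2 k N / ((N - k) * chi2_point).

    largest_roots[k] is assumed to hold ch_1(V_Nk) for k = 0, ..., N;
    entries below n0 are ignored. At k = N the bound is infinite, so the
    rule stops at N at the latest.
    """
    for k in range(n0, N):
        bound = d ** 2 * k * N / ((N - k) * chi2_point)
        if largest_roots[k] <= bound:
            return k
    return N

# Hypothetical inputs: constant root 1.0, N = 100, n0 = 5, d = 0.5,
# and 3.84 as the upper 5% point of chi-square with 1 DF.
roots = [1.0] * 101
n_star_d = stopping_time(roots, N=100, n0=5, d=0.5, chi2_point=3.84)
```

With these inputs the bound first reaches 1 at $k = 14$; a much smaller $d$ forces sampling to continue to the whole population.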

Thus $n^*_d$ is a positive integer valued random variable and $n_0 \le n^*_d \le N$. Then, starting with the sample size $n_0$, units are drawn (one by one) without replacement so long as $k < n^*_d$, i.e., $\mathrm{ch}_1(\mathbf{V}_{Nk}) > d^2 kN/[(N-k)\chi^2_{p-1,\alpha}]$. When $k = n^*_d$, sampling is terminated and $I_{Nn^*_d}$, defined by (4.9) for $n = n^*_d$, is taken as a (simultaneous) confidence region for $\boldsymbol\beta_N$. Note that if $\mathbf{A}$ is any p.d. matrix, then by the Schwarz inequality

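$$\sup\{|\boldsymbol\ell'\mathbf{x}|: \boldsymbol\ell'\boldsymbol\ell = 1\} = \sup\{|\boldsymbol\ell'\mathbf{A}^{1/2}\mathbf{A}^{-1/2}\mathbf{x}|: \boldsymbol\ell'\boldsymbol\ell = 1\} \le [\sup\{\boldsymbol\ell'\mathbf{A}\boldsymbol\ell: \boldsymbol\ell'\boldsymbol\ell = 1\}(\mathbf{x}'\mathbf{A}^{-1}\mathbf{x})]^{1/2} = [\mathrm{ch}_1(\mathbf{A})(\mathbf{x}'\mathbf{A}^{-1}\mathbf{x})]^{1/2}. \tag{4.11}$$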
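A quick numerical check of the inequality (4.11) in the $2 \times 2$ case, where the largest characteristic root and the inverse are available in closed form. This is only a sketch; the helper names and the matrices are illustrative assumptions. The left-hand side of (4.11) is the Euclidean norm of $\mathbf{x}$, since the supremum of $|\boldsymbol\ell'\mathbf{x}|$ over unit vectors is attained at $\boldsymbol\ell = \mathbf{x}/\|\mathbf{x}\|$.

```python
import math

def largest_root_2x2(a, b, c):
    """Largest characteristic root of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * ((a + c) + math.sqrt((a - c) ** 2 + 4.0 * b * b))

def quad_form_inverse_2x2(a, b, c, x):
    """x' A^{-1} x for A = [[a, b], [b, c]] with positive determinant."""
    det = a * c - b * b
    x1, x2 = x
    return (c * x1 * x1 - 2.0 * b * x1 * x2 + a * x2 * x2) / det

def schwarz_sides(a, b, c, x):
    """Return (lhs, rhs) of (4.11): ||x|| and sqrt(ch_1(A) * x'A^{-1}x)."""
    lhs = math.hypot(x[0], x[1])
    rhs = math.sqrt(largest_root_2x2(a, b, c) * quad_form_inverse_2x2(a, b, c, x))
    return lhs, rhs

# Hypothetical p.d. matrix and vector.
lhs, rhs = schwarz_sides(2.0, 0.5, 1.0, (1.0, -3.0))
# Equality case: x along the eigenvector of the largest root of diag(4, 1).
lhs_eq, rhs_eq = schwarz_sides(4.0, 0.0, 1.0, (1.0, 0.0))
```

The bound is attained when $\mathbf{x}$ lies along a characteristic vector belonging to the largest root, which is why the diameter bound derived from it cannot be improved in general.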


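Hence, choosing $\mathbf{A} = \mathbf{V}_{Nn^*_d}$ and $\mathbf{x} = (\mathbf{b}^*_{Nn^*_d} - \boldsymbol\beta_N)$, we obtain from (4.9), (4.10) and (4.11) that

$$\text{the maximum diameter of } I_{Nn^*_d} \le 2d, \tag{4.12}$$

so that the width of the confidence interval for any $\boldsymbol\ell'\boldsymbol\beta_N$ is bounded by $(\boldsymbol\ell'\boldsymbol\ell)^{1/2}\,2d$, i.e., $I_{Nn^*_d}$ is a bounded-diameter confidence region for $\boldsymbol\beta_N$. We intend to show that as $d \downarrow 0$,

$$P\{\boldsymbol\beta_N \in I_{Nn^*_d}\} \to 1 - \alpha, \tag{4.13}$$

insuring that the confidence coefficient of $I_{Nn^*_d}$ approaches $1 - \alpha$ when $d$ is chosen small. Towards this, we define

$$n^0_d = \min\{k: k \ge n_0 \text{ and } \mathrm{ch}_1(\boldsymbol\Gamma_N) \le d^2 kN/[(N-k)\chi^2_{p-1,\alpha}]\}. \tag{4.14}$$

Then, by (3.15), (4.10) and (4.14), we have

$$n^*_d / n^0_d \to 1 \ \text{s.c., as } d \downarrow 0, \tag{4.15}$$

and as a result, using the continuity (or, rather, the tightness) property of $\mathbf{Z}_N$ [implied by the convergence in law of $\mathbf{Z}_N$ to $\mathbf{Z}^0$], we obtain that $\mathbf{Z}_N(N^{-1}n^*_d) - \mathbf{Z}_N(N^{-1}n^0_d) \to \mathbf{0}$ in probability, while by Theorem 3.4, $\mathbf{Z}_N(N^{-1}n^0_d)$ has asymptotically a multivariate normal distribution. Combining this with (3.15), we conclude that

$$T_{Nn^*_d} - T_{Nn^0_d} \to 0 \ \text{in probability, as } d \downarrow 0, \tag{4.16}$$

and using (4.16) along with the fact that $T_{Nn^0_d}$ has asymptotically the chi-square distribution with $p-1$ DF, we conclude that

$$P\{\boldsymbol\beta_N \in I_{Nn^*_d}\} = P\{T_{Nn^*_d} \le \chi^2_{p-1,\alpha} \mid \boldsymbol\beta_N\} = 1 - P\{T_{Nn^*_d} > \chi^2_{p-1,\alpha} \mid \boldsymbol\beta_N\} \to 1 - \alpha \ \text{as } d \downarrow 0. \tag{4.17}$$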

Thus, (4.13) holds. The theory developed here is an extension of the Chow-Robbins (1965) theory of fixed-width confidence intervals to finite population sampling and to a more general class of statistics.

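In medical trials, often, repeated significance tests (RST) are made on an increasing sequence of sample sizes with a view to stopping earlier if a significant result is obtained at that time (prior to reaching the target sample size). Here, we shall develop such RST procedures for sampling from a finite population. The theory rests on the invariance principles studied in Section 3. Let

$$T_{Nk} = (k^2/N)(\mathbf{b}^*_{Nk} - \boldsymbol\beta_N)'\mathbf{V}_{Nk}^{-1}(\mathbf{b}^*_{Nk} - \boldsymbol\beta_N) \quad \text{for } N^* \le k \le N, \tag{4.18}$$

where $N^* \to \infty$ but $N^{-1}N^* \to 0$ as $N \to \infty$, while we let $T_{Nk} = 0$, $\forall\ k < N^*$. Let then

$$K_{Nn} = \max_{k \le n} T_{Nk} \quad \text{and} \quad M_{Nn} = N^{-1} \sum_{k \le n} T_{Nk}, \tag{4.19}$$

where $n/N \to \nu \in (0, 1]$. Then, we have

$$K_{Nn} \xrightarrow{\mathcal{D}} \sup_{0 \le t \le \nu} \Big\{\sum_{j=1}^{p-1} [W_j(t)]^2\Big\} = K^{(p-1)}_\nu, \ \text{say}, \tag{4.20}$$

$$M_{Nn} \xrightarrow{\mathcal{D}} \sum_{j=1}^{p-1} \int_0^\nu [W_j(t)]^2\,dt = M^{(p-1)}_\nu, \ \text{say}, \tag{4.21}$$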
where $\{W_j(t),\ t \in I\}$, $j = 1, \ldots, p-1$, are independent copies of a standard Brownian bridge on $I$. Let $K^{(p-1)}_{\nu,\alpha}$ and $M^{(p-1)}_{\nu,\alpha}$ be the upper $100\alpha\%$ points of the distributions of $K^{(p-1)}_\nu$ and $M^{(p-1)}_\nu$, respectively. [For $p = 2$, $K^{(1)}_\nu$ and $M^{(1)}_\nu$ are functionals of a single Brownian bridge and their distributions are known; see Koziol and Byar (1975) and Pettitt and Stephens (1976). For $p \ge 3$, analytical solutions appear to be intractable; however, the prospect of simulation is quite bright. We may refer to Majumdar (1976) for some related work.]
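As a rough indication of how such a simulation might be set up, the sketch below approximates the two functionals on a discrete grid by building each Brownian bridge as $W(t) = B(t) - tB(1)$ from a Gaussian random walk. Everything here is an assumption made for illustration (function names, grid size, number of paths, seed, and the crude empirical quantile); it is not the procedure of any of the cited papers.

```python
import random

def simulate_bridge_functionals(nu, dim, n_steps=100, n_paths=200, seed=0):
    """Monte Carlo draws of sup and integral over [0, nu] of sum_j W_j(t)^2."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    cut = int(round(nu * n_steps))
    sups, integrals = [], []
    for _path in range(n_paths):
        squared_norm = [0.0] * (n_steps + 1)
        for _component in range(dim):
            walk = [0.0]
            for _step in range(n_steps):
                walk.append(walk[-1] + rng.gauss(0.0, dt ** 0.5))
            for i in range(n_steps + 1):
                w = walk[i] - (i * dt) * walk[-1]   # tie the walk down at t = 1
                squared_norm[i] += w * w
        sups.append(max(squared_norm[:cut + 1]))
        integrals.append(sum(squared_norm[1:cut + 1]) * dt)
    return sups, integrals

def upper_point(values, alpha):
    """Crude empirical upper 100*alpha % point of a list of draws."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[index]

# Hypothetical settings: two independent bridges (p - 1 = 2), nu = 0.5 and 1.
sups_half, ints_half = simulate_bridge_functionals(0.5, dim=2, seed=3)
sups_full, ints_full = simulate_bridge_functionals(1.0, dim=2, seed=3)
k_crit = upper_point(sups_half, 0.05)
```

Because both runs use the same seed and draw the same number of variates, they share their paths, so each supremum and integral over $[0, 1]$ is at least as large as its counterpart over $[0, 0.5]$. In practice a finer grid and many more paths would be needed for usable critical values.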

Suppose now that we desire to test $H_0$: $\boldsymbol\beta_N = \boldsymbol\beta_0$ (specified) against $H_1$: $\boldsymbol\beta_N \ne \boldsymbol\beta_0$. Let $T^0_{Nk}$ be defined by (4.18) when $\boldsymbol\beta_N$ is replaced by $\boldsymbol\beta_0$, and let $K^0_{Nn}$ (and $M^0_{Nn}$) be defined by (4.19) when $T_{Nk}$ is replaced by $T^0_{Nk}$, $k \ge 1$. Then, we have the following RST procedure:

Continue sampling as long as $k \le n$ and $K^0_{Nk}$ (or $M^0_{Nk}$) is $\le K^{(p-1)}_{\nu,\alpha}$ (or $M^{(p-1)}_{\nu,\alpha}$). If, for the first time, for $k = D\,(\le n)$, $K^0_{ND} > K^{(p-1)}_{\nu,\alpha}$ (or $M^0_{ND} > M^{(p-1)}_{\nu,\alpha}$), stop sampling when $\mathbf{X}_{ND}$ is observed, along with the rejection of $H_0$. If no such $D\,(\le n)$ exists, stop sampling at the preplanned $n$-th stage (i.e., when $\mathbf{X}_{Nn}$ is observed), along with the acceptance of $H_0$.
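The RST procedure based on the running maximum can be sketched in a few lines. The function name `repeated_significance_test`, the convention that the list holds $T^0_{N1}, \ldots, T^0_{Nn}$, and the numbers used below are assumptions for illustration; the critical value is supplied by the caller.

```python
def repeated_significance_test(t0_values, critical_value):
    """Run the max-type RST on T^0_{N1}, ..., T^0_{Nn}.

    Returns (stage, reject): the stage at which sampling stops and whether
    H_0 is rejected. The running maximum plays the role of K^0_{Nk}.
    """
    running_max = 0.0
    for k, t in enumerate(t0_values, start=1):
        running_max = max(running_max, t)
        if running_max > critical_value:
            return k, True               # first exceedance: stop and reject
    return len(t0_values), False         # reached the planned n: accept

# Hypothetical statistic values for n = 5 stages.
t0 = [0.0, 0.0, 1.2, 3.5, 2.0]
early = repeated_significance_test(t0, critical_value=3.0)
full = repeated_significance_test(t0, critical_value=5.0)
```

The sum-type version based on $M^0_{Nk}$ would replace the running maximum by the cumulative sum divided by $N$.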

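By (4.9) through (4.13), we conclude that the asymptotic level of significance of this test is equal to $\alpha$. Also, $E(D) \le n$, indicating a saving in the average amount of sampling over the fixed-sample size procedure. In fact, we may even test for a more general hypothesis:

$$H_0: \mathbf{C}\boldsymbol\beta_N = \boldsymbol\theta_0 \quad \text{vs.} \quad H_1: \mathbf{C}\boldsymbol\beta_N \ne \boldsymbol\theta_0, \tag{4.22}$$

where $\mathbf{C}$ is a $q \times (p-1)$ matrix of rank $q\,(1 \le q \le p-1)$ and $\boldsymbol\theta_0$ is a specified $q$-vector. For this case, in (4.18), we need to take $T_{Nk} = N^{-1}k^2(\mathbf{C}\mathbf{b}^*_{Nk} - \boldsymbol\theta_0)'(\mathbf{C}\mathbf{V}_{Nk}\mathbf{C}')^{-1}(\mathbf{C}\mathbf{b}^*_{Nk} - \boldsymbol\theta_0)$ for $k \ge N^*$ (and equal to 0 for $k < N^*$), and in (4.20)-(4.21), we need to change $p-1$ to $q$. The rest of the sequential procedure remains the same.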


4.2.  Estimation and Testing of Ratio of Means.

In the same set up of Section 4.1, we consider the parameters

$$\rho_{N(ij)} = \bar a^{(i)}_N / \bar a^{(j)}_N, \quad \text{for } 1 \le i < j \le p, \tag{4.23}$$

and our interest centers around one or more of these ratios. The usual estimator of $\rho_{N(ij)}$ is

$$\hat\rho_{N(ij)} = \bar X^{(i)}_{Nn} / \bar X^{(j)}_{Nn}, \quad \text{where } \bar{\mathbf{X}}_{Nn} = n^{-1} \sum_{i=1}^{n} \mathbf{X}_{Ni} = (\bar X^{(1)}_{Nn}, \ldots, \bar X^{(p)}_{Nn}), \tag{4.24}$$

for $1 \le i < j \le p$. Though $\bar{\mathbf{X}}_{Nn}$ is a U-statistic (vector) and is unbiased for $\bar{\mathbf{a}}_N$, $\hat\rho_{N(ij)}$ is not generally an unbiased estimator of $\rho_{N(ij)}$. Hence, jackknifing may be employed to reduce the bias. Here, we note that

$$(\partial/\partial b_i)(b_1/b_2) = (-1)^{i-1}(b_1/b_2)^{i-1} b_2^{-1} \quad \text{for } i = 1, 2. \tag{4.25}$$

Hence, we may proceed as in (4.9) through (4.22) and consider point as well as confidence interval estimators of $\rho_{N(ij)}$, $1 \le i < j \le p$, and also (sequential or nonsequential) tests for any subset of these parameters. In passing, we may remark that the Grizzle-Starmer-Koch (1969) linear models with categorical data extend to the situation when proportions are ratios and samples are drawn without replacement from finite populations, and where jackknifing is employed for bias reduction. This is actually a special case of (4.23) when each of the $p$ arguments of $\mathbf{a}_{Ni}$ is either 0 or 1, so that $\bar a^{(i)}_N$ is the proportion of 1's in the $N$ responses on the $i$-th characteristic, for $i = 1, \ldots, p$.
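For a single pair of characteristics, the jackknifed ratio of means and its Tukey variance estimator can be sketched as follows. The function name `jackknife_ratio` and the data are illustrative assumptions; the ratio of sample means is computed as a ratio of sums, which is the same quantity.

```python
def jackknife_ratio(x, y):
    """Ratio of means, its jackknife version, and the Tukey variance estimator."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    rho_hat = sum_x / sum_y
    pseudo = []
    for xi, yi in zip(x, y):
        rho_loo = (sum_x - xi) / (sum_y - yi)        # ratio without the i-th unit
        pseudo.append(n * rho_hat - (n - 1) * rho_loo)
    rho_star = sum(pseudo) / n
    v = sum((p - rho_star) ** 2 for p in pseudo) / (n - 1)
    return rho_hat, rho_star, v

# Hypothetical paired sample; in the first case x is exactly twice y.
exact = jackknife_ratio([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
noisy = jackknife_ratio([2.1, 3.8, 6.3, 7.9], [1.0, 2.0, 3.0, 4.0])
```

When the two characteristics are exactly proportional, every leave-one-out ratio equals the full-sample ratio, so the jackknife leaves the estimator unchanged and the variance estimate is zero.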

4.3.  Optimal Allocation in Stratified Random Sampling

Suppose that the population of $N$ units is subdivided into $r\,(\ge 2)$ sub-populations of sizes $N_1, \ldots, N_r$ (so that $N = N_1 + \cdots + N_r$). Let $\bar{\mathbf{a}}_{Nh}$ and $\boldsymbol\Lambda_{Nh}$ be defined as in (4.1) for the $h$-th sub-population, $h = 1, \ldots, r$. Suppose that a sample of size $n$ is to be drawn and let $n_1, \ldots, n_r$ denote the sub-sample sizes for the $r$ strata. Optimal allocation of $n_1, \ldots, n_r$ [viz., Chapter 5 of Cochran (1963)] depends on $\boldsymbol\Lambda_{N1}, \ldots, \boldsymbol\Lambda_{Nr}$, which are all unknown. We may start with an initial sample of size $n^0\,(= rn_0)$ with $n_0$ observations from each stratum, estimate the $\boldsymbol\Lambda_{Nh}$, $1 \le h \le r$, and using these estimates get an estimated optimal allocation for $n$; this usual practice entails some loss of efficiency. As in Williams and Sen (1973), we may consider a multistage (or sequential) procedure where we keep on updating the estimators of $\boldsymbol\Lambda_{Nh}$, $1 \le h \le r$, so that the procedure will be asymptotically optimal. In this context, jackknifing can also be used: the theorems studied in Section 3 insure that for jackknifing, the sequential procedure leads to an asymptotically optimal allocation.
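For orientation only, the sketch below shows the classical allocation for a single characteristic with equal sampling costs, in which $n_h$ is taken proportional to $N_h S_h$ ($S_h$ being the stratum standard deviation). This univariate, equal-cost setting, the function name `neyman_allocation` and the numbers are assumptions of the example; the paper itself works with the dispersion matrices and does not spell out an allocation formula. In a sequential scheme the standard deviations would be re-estimated (for instance by jackknifing) and the allocation recomputed at each stage.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n sample units proportionally to N_h * S_h (unrounded)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum must have positive N_h * S_h")
    return [n * w / total for w in weights]

# Hypothetical strata: sizes 100 and 200, estimated SDs 2.0 and 1.0, n = 100.
allocation = neyman_allocation([100, 200], [2.0, 1.0], 100)
```

The returned values are not rounded; turning them into integer sub-sample sizes (and capping them at the stratum sizes) is left to the caller.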

5.  PROOF  OF  THEOREM  3.2 


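We consider here the proof of Theorem 3.2. Note that by (2.11),

$$U_{Nn} - \mu_N = mW^{(1)}_{Nn} + W^*_{Nn}; \quad W^*_{Nn} = \sum_{h=2}^{m} \binom{m}{h} W^{(h)}_{Nn}, \tag{5.1}$$

for every $N \ge n \ge m$. Hence, for every $n > m$,

$$U_{Nn-1} - U_{Nn} = m(W^{(1)}_{Nn-1} - W^{(1)}_{Nn}) + (W^*_{Nn-1} - W^*_{Nn}). \tag{5.2}$$

By virtue of (2.5), (2.9), (2.10) and the results of Nandi and Sen (1963), it follows by some routine steps that

$$E\{[n(n-1)(W^*_{Nn-1} - W^*_{Nn})^2]^2\} \le cn^{-2}, \quad \forall\ m < n \le N, \tag{5.3}$$

where $c$ does not depend on $N$. Hence, for every $N^*\,(\le N)$: $N^* \to \infty$,

$$\begin{aligned} P\Big\{\max_{N^* \le n \le N} n(n-1)E[(W^*_{Nn-1} - W^*_{Nn})^2 \mid C_{Nn}] > \varepsilon\Big\} &\le \sum_{n=N^*}^{N} P\{n(n-1)E[(W^*_{Nn-1} - W^*_{Nn})^2 \mid C_{Nn}] > \varepsilon\} \\ &\le \sum_{n=N^*}^{N} \varepsilon^{-2} E[n(n-1)(W^*_{Nn-1} - W^*_{Nn})^2]^2 \\ &\le c\varepsilon^{-2} \sum_{n=N^*}^{N} n^{-2} \le c\varepsilon^{-2}(N^*-1)^{-1} \to 0, \quad \forall\ \varepsilon > 0. \end{aligned} \tag{5.4}$$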


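Hence, to prove the theorem, it suffices to replace $n(n-1)(U_{Nn-1} - U_{Nn})^2$ by $m^2 n(n-1)(W^{(1)}_{Nn-1} - W^{(1)}_{Nn})^2$. Toward this, note that

$$\begin{aligned} n(n-1)E[(W^{(1)}_{Nn-1} - W^{(1)}_{Nn})^2 \mid C_{Nn}] &= n(n-1)^{-1}E[\{W^{(1)}_{Nn} - (f_1(X_{Nn}) - \mu_N)\}^2 \mid C_{Nn}] \\ &= n(n-1)^{-1}\Big[n^{-1}\sum_{i=1}^{n} (f_1(X_{Ni}) - \mu_N)^2 - \{W^{(1)}_{Nn}\}^2\Big]. \end{aligned} \tag{5.5}$$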
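The algebraic identity in (5.5) is easy to confirm numerically. In the sketch below, which uses assumed names and arbitrary data, `y[i]` stands for $f_1(X_{Ni}) - \mu_N$, so that $W^{(1)}_{Nn}$ is the mean of `y` and the conditional expectation given $C_{Nn}$ is the average over which unit is left out.

```python
def identity_5_5_sides(y):
    """Return (lhs, rhs) of (5.5) for centred values y_i = f_1(X_Ni) - mu_N."""
    n = len(y)
    total = sum(y)
    w_n = total / n                                   # W^{(1)}_{Nn}
    # Left side: n(n-1) times the average over i of (W^{(1)}_{Nn-1} - W^{(1)}_{Nn})^2,
    # where the (n-1)-sample omits unit i.
    lhs = n * (n - 1) * sum(((total - yi) / (n - 1) - w_n) ** 2 for yi in y) / n
    # Right side: n/(n-1) times (mean square of y minus W^2).
    rhs = n / (n - 1) * (sum(v * v for v in y) / n - w_n ** 2)
    return lhs, rhs

# Arbitrary illustrative values.
lhs_55, rhs_55 = identity_5_5_sides([0.5, -1.0, 2.0, 0.25, -0.75])
```

The key step is $W^{(1)}_{Nn-1} - W^{(1)}_{Nn} = (W^{(1)}_{Nn} - y_i)/(n-1)$ when unit $i$ is omitted, which gives the factor $n(n-1)^{-1}$.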

Note that $\{W^{(1)}_{Nn}, C_{Nn},\ n \ge m\}$ is a reverse martingale (being a U-statistic sequence) and $E\{[W^{(1)}_{Nn}]^2\} = \zeta_{1,N}((N-n)/Nn)$. Hence, by Theorem 1 of Sen (1970), $\forall\ \varepsilon > 0$,

$$P\Big\{\max_{N^* \le n \le N} [W^{(1)}_{Nn}]^2 > \varepsilon\Big\} \le \varepsilon^{-1}\zeta_{1,N}\{(N-N^*)/NN^*\} \to 0 \ \text{as } N^* \to \infty. \tag{5.6}$$

On the other hand, $\tilde U_{Nn} = n^{-1}\sum_{i=1}^{n} \{f_1(X_{Ni}) - \mu_N\}^2$ $(n \ge 1)$ is also a U-statistic with $E\tilde U_{Nn} = \zeta_{1,N}$, and by (2.5) and by (3.14) of Nandi and Sen (1963), $V\{\tilde U_{Nn}\} = O(n^{-1}(N-n)/N)$, $\forall\ n \le N$. Hence, by Theorem 1 of Sen (1970), we have for every $\varepsilon > 0$,

$$P\Big\{\max_{N^* \le n \le N} |\tilde U_{Nn} - \zeta_{1,N}| > \varepsilon\Big\} \le \varepsilon^{-2}V(\tilde U_{NN^*}) \to 0 \ \text{as } N^* \to \infty. \tag{5.7}$$

From (5.5), (5.6), and (5.7), it follows that $|n(n-1)E\{(W^{(1)}_{Nn-1} - W^{(1)}_{Nn})^2 \mid C_{Nn}\} - \zeta_{1,N}| \to 0$ s.c., and the proof of the theorem is complete.

Remark. For infinite populations, a similar result has been proved by Bhattacharyya and Sen (1977). In view of the relatively stringent assumption (2.5), for finite population sampling, the present proof is considerably simpler in nature.